k-class estimator
Distributionally Robust Instrumental Variables Estimation
Instrumental variables (IV) estimation, also known as IV regression, is a fundamental method in econometrics and statistics for inferring causal relationships in observational data with unobserved confounding. It leverages access to additional variables (instruments) that affect the outcome exogenously and exclusively through the endogenous regressor to yield consistent causal estimates, even when the standard ordinary least squares (OLS) estimator is biased by unobserved confounding (Imbens and Angrist, 1994; Angrist et al., 1996; Imbens and Rubin, 2015). Over the years, IV estimation has become an indispensable tool for causal inference in empirical work in economics (Card and Krueger, 1994), as well as in the study of genetic and epidemiological data (Davey Smith and Ebrahim, 2003). Despite its widespread use in empirical and applied work, IV has important limitations and challenges, such as invalid instruments (Sargan, 1958; Murray, 2006), weak instruments (Staiger and Stock, 1997), non-compliance (Imbens and Angrist, 1994), and heteroskedasticity, especially in settings with weak instruments or highly leveraged datasets (Andrews et al., 2019; Young, 2022). These issues can significantly affect the validity and quality of estimation and inference using instrumental variables (Jiang, 2017). Many works have since been devoted to assessing and addressing these issues, through statistical tests (Hansen, 1982; Stock and Yogo, 2002), sensitivity analysis (Rosenbaum and Rubin, 1983; Bonhomme and Weidner, 2022), and additional assumptions or structures on the data generating process (Kolesár et al., 2015; Kang et al., 2016; Guo et al., 2018b). Recently, an emerging line of work has highlighted interesting connections between causality and the concepts of invariance and robustness (Peters et al., 2016; Meinshausen, 2018; Rothenhäusler et al., 2021; Bühlmann, 2020; Jakobsen and Peters, 2022; Fan et al., 2024).
Their guiding philosophy is that causal properties can be viewed as robustness against changes across heterogeneous environments, represented by a set P of data distributions.
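To make the contrast between OLS and IV concrete, the following is a minimal simulation sketch, not taken from the paper. The data generating process, the effect size `beta_true`, and the helper names `ols` and `two_stage_least_squares` are all assumptions chosen for illustration. It uses textbook two-stage least squares with a single instrument and a single endogenous regressor.

```python
import numpy as np

def ols(X, y):
    # Least-squares coefficients of y on X (no intercept; data are mean zero).
    return np.linalg.lstsq(X, y, rcond=None)[0]

def two_stage_least_squares(Z, X, y):
    # Stage 1: project the endogenous regressor onto the instrument.
    X_hat = Z @ ols(Z, X)
    # Stage 2: regress the outcome on the projected regressor.
    return ols(X_hat, y)

# Hypothetical data generating process (an assumption for illustration only).
rng = np.random.default_rng(0)
n = 20_000
beta_true = 1.5                       # assumed causal effect of X on y
Z = rng.normal(size=(n, 1))           # instrument: affects y only through X
U = rng.normal(size=(n, 1))           # unobserved confounder of X and y
X = 0.8 * Z + U + rng.normal(size=(n, 1))
y = (beta_true * X + 2.0 * U + rng.normal(size=(n, 1))).ravel()

beta_ols = ols(X, y)[0]                        # biased upward by the confounder
beta_iv = two_stage_least_squares(Z, X, y)[0]  # close to beta_true

print(f"OLS: {beta_ols:.3f}  2SLS: {beta_iv:.3f}  truth: {beta_true}")
```

In this simulated setting the OLS coefficient absorbs part of the confounder's effect, while the 2SLS coefficient lands near the assumed true value because the instrument is independent of the confounder by construction.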
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.93)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Mathematical & Statistical Methods (0.67)
Distributional Robustness of K-class Estimators and the PULSE
Jakobsen, Martin Emil, Peters, Jonas
Recently, in causal discovery, invariance properties such as the moment criterion that the two-stage least squares estimator leverages have been exploited for causal structure learning: for example, in cases where the causal parameter is not identifiable, some structure of the non-zero components may be identified, and coverage guarantees are available. Subsequently, anchor regression has been proposed to trade off invariance and predictability. The resulting estimator is shown to have optimal predictive performance under bounded shift interventions. In this paper, we show that the concepts of anchor regression and K-class estimators are closely related. Establishing this connection comes with two benefits: (1) it enables us to prove robustness properties for existing K-class estimators when considering distributional shifts, and (2) it leads us to propose a novel estimator in instrumental variable settings that minimizes the mean squared prediction error subject to the constraint that the estimator lies in an asymptotically valid confidence region of the causal parameter. We call this estimator PULSE (p-uncorrelated least squares estimator) and show that it can be computed efficiently, even though the underlying optimization problem is non-convex. We further prove that it is consistent. Simulation experiments illustrate that there are several settings, including weak instrument settings, in which PULSE outperforms other estimators and suffers from less variability.
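For readers unfamiliar with the K-class family, here is a minimal sketch of the standard textbook definition, $\hat\beta(\kappa) = (X^\top (I - \kappa M_Z) X)^{-1} X^\top (I - \kappa M_Z) y$ with $M_Z = I - P_Z$ and $P_Z$ the projection onto the instruments. Setting $\kappa = 0$ gives OLS and $\kappa = 1$ gives two-stage least squares. The function name `k_class` and the simulated data are assumptions for illustration; this is not the PULSE procedure, which additionally selects the estimate through a test-based constraint.

```python
import numpy as np

def k_class(Z, X, y, kappa):
    # P_Z X and P_Z y: fitted values from regressing X and y on the instruments.
    PX = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]
    Py = Z @ np.linalg.lstsq(Z, y, rcond=None)[0]
    # Uses the identity I - kappa * M_Z = (1 - kappa) * I + kappa * P_Z,
    # which avoids forming any n-by-n matrix.
    A = (1 - kappa) * (X.T @ X) + kappa * (X.T @ PX)
    b = (1 - kappa) * (X.T @ y) + kappa * (X.T @ Py)
    return np.linalg.solve(A, b)

# Hypothetical simulated data (an assumption for illustration only).
rng = np.random.default_rng(1)
n = 5_000
Z = rng.normal(size=(n, 2))                  # two instruments
U = rng.normal(size=n)                       # unobserved confounder
X = (Z @ np.array([0.7, -0.4]) + U + rng.normal(size=n)).reshape(-1, 1)
y = 1.0 * X.ravel() + 2.0 * U + rng.normal(size=n)

# Interpolating kappa between 0 and 1 moves the estimate from OLS to 2SLS.
for kappa in (0.0, 0.5, 0.9, 1.0):
    print(kappa, k_class(Z, X, y, kappa))
```

Intermediate values of $\kappa$ trade prediction accuracy against reliance on the instruments, which is the kind of trade-off between predictability and invariance that the abstract attributes to anchor regression.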